Abstract: Human coronavirus OC43 (HCoV-OC43) is one of the causes of the “common cold” in
humans during seasons of cold weather. The primary function of the HCoV-OC43 nucleocapsid
protein (N protein) is to recognize viral genomic RNA, which leads to ribonucleocapsid formation.
Here, we characterized the stability and identified the functional regions of the recombinant
HCoV-OC43 N protein. Circular dichroism and fluorescence measurements revealed that the HCoV-OC43
N protein is more highly ordered and more stable than the previously studied SARS-CoV N protein.
Surface plasmon resonance (SPR) experiments showed that the affinity of HCoV-OC43 N protein
for RNA was approximately fivefold higher than its affinity for DNA. Moreover, we found that
the HCoV-OC43 N protein contains three RNA-binding regions, located in its N-terminal region
(residues 1-173) and central-linker region (residues 174-232 and 233-300). The binding affinities
of the truncated N proteins for RNA follow the order: residues 1-173 ≈ residues 233-300 > residues
174-232. SPR experiments demonstrated that the C-terminal region (residues 301-448) of HCoV-OC43 N
protein lacks RNA-binding activity, while crosslinking and gel filtration analyses revealed that the
C-terminal region is mainly involved in the oligomerization of the HCoV-OC43 N protein. This study
may aid the understanding of the mechanism of HCoV-OC43 nucleocapsid formation.
Introduction

The OC43 strain of the type II coronavirus family
(HCoV-OC43) was first identified in the 1960s and is
responsible for ~20% of all “common colds” in
humans. Although HCoV-OC43 infections are generally
mild, more severe upper and lower respiratory tract
infections such as bronchiolitis and pneumonia have been
documented, especially in infants, elderly individuals,
and immunocompromised patients. Moreover, clusters of
HCoV-OC43 infections have been reported to cause
pneumonia in otherwise healthy adults.
Several studies have reported that both neurotropism
and neuroinvasion of HCoV are associated with multiple
sclerosis, especially in the case of the OC43 strain.

CoV particles have an irregular shape, which is
determined by an outer envelope with distinctive,
“club-shaped” peplomers. These peplomers give the virus a
crown-like (corona) appearance. The viral genome of
coronaviruses consists of positive-sense, single-stranded
RNA of ~30 kb. The genome contains several
genes that encode the structural and nonstructural proteins
required for progeny virion production.
The virion envelope surrounding the nucleocapsid contains
the following structural proteins: S (spike),
M (membrane), and E (envelope). A third glycoprotein,
HE (hemagglutinin-esterase), is present in
most class II coronaviruses. A helical nucleocapsid
lies at the center of the viral particle. The nucleocapsid
protein, the major structural protein of CoVs,
binds to the viral RNA genome to form the virion
core, which leads to the formation of a ribonucleoprotein
(RNP) complex or a long helical nucleocapsid
structure. The formation of the RNP is important
for maintaining the RNA in an ordered conformation
suitable for replication and transcription of the viral
genome. Previous studies have shown that the
CoV N protein is involved in the regulation of cellular
processes such as gene transcription, actin reorganization,
host cell cycle progression, and apoptosis. It
has also been shown to act as an RNA chaperone.
Moreover, the N protein is an important diagnostic
marker and the most “immuno-dominant” antigen in the
host immune response.

Previous studies have revealed that the N- and
C-terminal domains of the CoV N proteins, including
those of SARS-CoV, murine hepatitis virus (MHV),
and avian infectious bronchitis virus (IBV), are responsible
for RNA binding and oligomerization,
respectively. The central region of the N protein
has also been shown to contain an RNA-binding
region and the primary sites of phosphorylation.
Phosphorylation of the N protein has
been shown to play an important role in virus biology.
To clarify the molecular mechanism of
ribonucleocapsid formation by CoVs, structures of
truncated fragments of N protein, including the N-terminal
and C-terminal domains, have been published.
Despite the conservation of some
motifs, CoV N proteins from different strains often
show quite different properties, primarily because of their
low sequence homology. For example, Saikatendu
et al. found that the structures of the SARS-CoV and IBV
N-terminal domains share several common features
but also differ in many subtle structural details.
On the basis of the different crystal packing of these
two structures, they suggested that the two viruses
most likely use different modes of oligomeric
self-association during RNP core formation.

The N protein of HCoV-OC43, which has a molecular
weight of 50 kDa, is highly basic (pI 10) and
strongly hydrophilic. It shows only 26-30% amino acid
similarity to the N proteins of other CoV strains.
Until now, few
studies have examined the biochemical properties of
the HCoV-OC43 N protein. In this study, we characterized
the stability and biochemical properties of the
recombinant full-length N protein from HCoV-OC43.
RNA-binding and oligomerization regions of the
HCoV-OC43 N protein were identified.


Results 


Stability studies of recombinant HCoV-OC43 
nucleocapsid protein 

Conformational changes of the HCoV-OC43 N protein
in response to temperature were monitored using CD
spectroscopy. As shown in Figure 1(A), the CD spectra
of HCoV-OC43 N protein were scanned from 190 to
250 nm at 20, 45, 60, and 90°C. Analysis by the
SELCON3 program revealed that HCoV-OC43 N
protein contains ~35.1% α-helices, 11.6% β-sheets,
23.8% turns, and 29.5% random coils at room temperature.
At high temperatures, the CD spectra of HCoV-OC43
N protein showed decreased intensity between
210 and 220 nm and a large negative peak at 190 nm,
which are characteristic of disordered protein. Previous
CD spectra of the full-length N proteins from
SARS and 229E CoVs indicated that the N protein is
relatively disordered and composed of ~50% nonsecondary
structures. Our present results suggest that
the conformation of the HCoV-OC43 N protein is more
ordered than those of the HCoV-229E and SARS-CoV N
proteins [Fig. 1(A)]. We also measured the melting
temperature (Tm) of the HCoV-OC43 N protein using CD
[Fig. 1(B)]. Heat denaturation analysis showed that the
Tm of HCoV-OC43 N protein is 52°C under our conditions.
The thermal denaturation of HCoV-OC43 N protein
is reversible (data not shown).

Figure 1. (A) CD spectra of HCoV-OC43 N protein at
various temperatures including 20, 45, 60, and 90°C. The
protein concentration was 5 µM and the buffer consisted of
50 mM Tris-HCl (pH 7.3), 150 mM NaCl, and 0.1% CHAPS.
(B) Thermostability measurements of HCoV-OC43 N protein
monitored by CD spectra. The protein concentration was
3.8 µM and the buffer consisted of 50 mM Tris-HCl (pH
7.3), 150 mM NaCl, and 0.1% CHAPS.

Denaturants including urea were used to
measure the denaturation of HCoV-OC43 N protein.
We used tryptophan (Trp) fluorescence to monitor
unfolding because the HCoV-OC43 N protein contains
five Trp residues [Fig. 2(A)]. The fluorescence emission
spectra showed the maximal emission wavelength
of the HCoV-OC43 N protein at ~340 nm.
Urea-induced unfolding caused a significant decrease in
fluorescence intensity at 340 nm and a red shift in the maximal
emission wavelength. The unfolding of HCoV-OC43 N
protein in the presence of urea was followed by monitoring
the fluorescence at 340 nm, and the denaturation
curve was fitted to a two-state model [Fig. 2(B)].
The denaturation and renaturation curves
were completely superimposable (data not shown).
The denaturant molarity at the midpoint of the transition
(Cm) was calculated as 4 M, and the unfolding
free energy change (ΔG°N-U) was determined by
the linear extrapolation method to be 14.5 kJ/mol
(Supporting Information Figure S1).
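
The two-state, linear-extrapolation analysis described above can be illustrated with a minimal sketch. It assumes the standard relation ΔG([urea]) = ΔG°N-U − m·[urea], so that the m-value follows from the reported ΔG°N-U and Cm as m = ΔG°N-U/Cm. That m-value and the temperature of 25°C (the incubation temperature given in the Methods) are assumptions of this sketch, not fitted parameters reported in the text.

```python
import math

R_KJ = 8.314e-3  # gas constant in kJ/(mol*K)

def fraction_unfolded(urea_molar, dg_water, m_value, temp_k=298.15):
    """Two-state model: fraction of unfolded protein at a given urea molarity."""
    dg = dg_water - m_value * urea_molar      # linear extrapolation of the free energy
    k_unfold = math.exp(-dg / (R_KJ * temp_k))  # equilibrium constant [U]/[N]
    return k_unfold / (1.0 + k_unfold)

DG_WATER = 14.5               # kJ/mol, value reported in the text
C_M = 4.0                     # M, transition midpoint reported in the text
M_VALUE = DG_WATER / C_M      # kJ/(mol*M), derived here as an assumption

for urea in (0.0, 2.0, 4.0, 6.0, 8.0):
    print(f"{urea:.1f} M urea: fraction unfolded = "
          f"{fraction_unfolded(urea, DG_WATER, M_VALUE):.3f}")
```

By construction, the modeled fraction unfolded is 0.5 at the midpoint concentration of 4 M.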


Studies of nucleic acid binding by HCoV-OC43 
nucleocapsid proteins 

To measure the binding affinity between the HCoV-OC43
nucleocapsid protein and single-stranded nucleic
acids, HCoV-OC43 nucleocapsid protein was allowed
to interact with biotin-labeled single-stranded RNA or
DNA, and the results were monitored by SPR.
Depending on the virus strain, there are two to four
UCUAA pentanucleotide repeats, with the last repeat
being UCUAAAC and termed the intergenic (IG)
sequence at the 3′ end of the leader. Previous
studies showed that HCoV N protein has high affinity
for the intergenic sequence. Therefore, the
repeated intergenic sequence of HCoV-OC43,
5′-(UCUAAAC)4-3′, was used as a probe in our SPR and
fluorescence experiments. For DNA, uracil was
replaced by thymine. The results show that HCoV-OC43
N protein was able to interact with both DNA
and RNA. However, HCoV-OC43 N protein showed a
higher binding capacity for RNA (~4989 RU) than for
DNA (~2803 RU) at the same protein concentration
[Fig. 3(A)].

Figure 2. (A) Fluorescence spectra of HCoV-OC43 N
protein in Tris-HCl supplemented with different
concentrations of urea. The protein concentration was 1 µM
and the buffer consisted of 50 mM Tris-HCl (pH 7.3),
150 mM NaCl, and 0.1% CHAPS. (B) Urea-induced
unfolding of HCoV-OC43 N protein monitored by
fluorescence emission at 340 nm. The protein concentration
was 1 µM and the buffer consisted of 50 mM Tris-HCl
(pH 7.3), 150 mM NaCl, and 0.1% CHAPS.
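
For readers who want the SPR probe sequences written out in full, the short sketch below expands the repeat notation used in this section: four copies of the intergenic heptanucleotide for the RNA probe, and the same sequence with uracil replaced by thymine for the DNA probe. The function name is illustrative only.

```python
IG_REPEAT = "UCUAAAC"  # intergenic (IG) sequence named in the text

def build_probes(repeat=IG_REPEAT, copies=4):
    """Return the (RNA, DNA) probe sequences, written 5' to 3'."""
    rna = repeat * copies
    dna = rna.replace("U", "T")  # DNA analog: uracil replaced by thymine
    return rna, dna

rna_probe, dna_probe = build_probes()
print("RNA probe:", rna_probe, f"({len(rna_probe)} nt)")
print("DNA probe:", dna_probe, f"({len(dna_probe)} nt)")
```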

Kinetic experiments were carried out to measure
the binding affinity between HCoV-OC43 N protein
and single-stranded RNA or DNA. Four concentrations
of N protein were used to determine both the
association rate constant (ka) and the dissociation rate
constant (kd). Figure 3(B) shows the Biacore SPR traces of
N protein bound to RNA. The ka and kd were calculated
from the association and dissociation phases of
the SPR traces, respectively. For RNA, the ka and kd
were 4.82 × 10⁴ M⁻¹ s⁻¹ and 5.75 × 10⁻⁴ s⁻¹, respectively.
The kinetic parameters for DNA showed a lower
ka and a higher kd of ~2.85 × 10⁴ M⁻¹ s⁻¹ and
1.69 × 10⁻³ s⁻¹, respectively. The dissociation constants (Kd)
of HCoV-OC43 N protein with target nucleic acid
strands were calculated as kd/ka (in M). The Kd of the
N protein bound to RNA was 11.9 nM, which is close
to the previously reported Kd of ~14 nM for the
interaction between MHV N protein and RNA. As
expected, the Kd for N protein and DNA was approximately
fivefold higher than that for N protein and
RNA.

Figure 3. (A) Binding capacity sensorgrams of HCoV-OC43
N protein (0.7 µM) bound to single-stranded RNA or
single-stranded DNA. The RNA and DNA sequences immobilized
on the SA chip were 5′-(UCUAAAC)4-3′ and
5′-(TCTAAAC)4-3′, respectively. (B) Sensorgrams of HCoV-OC43
N protein bound to single-stranded RNA. HCoV-OC43
N protein was exposed to RNA at concentrations of
0.7, 0.35, 0.175, and 0.0875 µM. The buffer consisted of 50
mM Tris-HCl (pH 7.3) containing 150 mM NaCl and 0.1%
CHAPS. The association rate increased with an increasing
concentration of target N protein and the dissociation rate
was independent of concentration. The fitted data are
shown in the Supporting Information (Figure S3).
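
The RNA versus DNA comparison in this section follows directly from the definition Kd = kd/ka. The sketch below simply repeats that arithmetic with the rate constants quoted in the text; the variable names are illustrative.

```python
def dissociation_constant(k_assoc, k_dissoc):
    """Equilibrium dissociation constant Kd (M) from ka (1/(M*s)) and kd (1/s)."""
    return k_dissoc / k_assoc

# Rate constants quoted in the text for the full-length N protein.
KD_RNA = dissociation_constant(4.82e4, 5.75e-4)
KD_DNA = dissociation_constant(2.85e4, 1.69e-3)
FOLD_WEAKER = KD_DNA / KD_RNA  # how much weaker DNA binding is than RNA binding

print(f"Kd (RNA) = {KD_RNA * 1e9:.1f} nM")
print(f"Kd (DNA) = {KD_DNA * 1e9:.1f} nM")
print(f"DNA/RNA Kd ratio = {FOLD_WEAKER:.1f}")
```

The ratio comes out close to 5, consistent with the approximately fivefold difference stated above.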


Identification of the RNA-binding region of 
HCoV-OC43 nucleocapsid protein 

According to the PONDR prediction, HCoV-OC43 N
protein contains two structural domains located at the
N-terminal and C-terminal regions, and between them
lies a central-linker region that is predicted to be
disordered [Fig. 4(A)]. To further isolate the RNA-binding
region of HCoV-OC43 N protein, we expressed and
purified truncated forms of the N protein: one
construct contained only the N-terminal domain
(N1-173), another contained only the C-terminal domain
(N301-448), and two contained different parts of the
central-linker region (N174-232 and N233-300).
These domain boundaries are based on the PONDR
predictions [Fig. 4(A)]. The binding of each truncated
N protein to RNA was analyzed by SPR [Fig. 4(B)].
These studies showed that the N-terminal domain
(N1-173) and the central-linker region (N174-232 and
N233-300), but not the C-terminal domain, bound to
RNA [Fig. 4(B)]. The kinetic constants for association
(ka in M⁻¹ s⁻¹) and dissociation (kd in s⁻¹) of the
truncated N proteins with single-stranded RNA
are listed in Table I. The ka values of N1-173 and
N233-300 were essentially the same, 2.18 × 10⁴ and
2.48 × 10⁴ M⁻¹ s⁻¹, respectively, whereas that of
N174-232 was 7.97 × 10³ M⁻¹ s⁻¹. Moreover, the
kd values of N1-173 and N233-300 were essentially
the same: 1.50 × 10⁻³ and 1.58 × 10⁻³ s⁻¹, respectively.
These constants were smaller than that of
N174-232 (3.53 × 10⁻³ s⁻¹). The dissociation constants
(Kd) of all truncated N proteins with RNA were
calculated as kd/ka (in M) and they follow the order:
N174-232 > N1-173 ≈ N233-300. Our results suggest
that the N protein of HCoV-OC43 contains three
RNA-binding regions.

Figure 4. (A) Order/disorder prediction for HCoV-OC43 N
protein using the VL3-BA predictor of the PONDR program.
(B) Sensorgrams of the truncated forms of HCoV-OC43 N
protein including N1-173, N174-232, N233-300, and
N301-448 bound to single-stranded RNA. Protein was tested at
concentrations of 2 µM in buffer containing 50 mM Tris-HCl
(pH 7.3), 150 mM NaCl, and 0.1% CHAPS. The RNA
immobilized on the SA chip was 5′-(UCUAAAC)4-3′. In
N233-300, bulk effects caused by the differences in the
refractive index of the running buffer and sample solution
caused a sharp descent in the curves from 140 s to 142 s;
hence, the dissociation measurements were started at
142 s (marked by the arrow).

Table I. The Kinetic Parameters of the Binding of Full-Length and Truncated N Proteins
to Single-Stranded RNA, 5′-(UCUAAAC)4-3′

Construct      ka (M⁻¹ s⁻¹)           kd (s⁻¹)               Kdᵃ (M)
Full-length    4.82 ± 0.21 × 10⁴      5.75 ± 0.23 × 10⁻⁴     11.9 ± 1.0 × 10⁻⁹
N1-173         2.18 ± 0.34 × 10⁴      1.5 ± 0.21 × 10⁻³      68.9 ± 8.4 × 10⁻⁹
N174-232       7.97 ± 0.62 × 10³      3.53 ± 0.32 × 10⁻³     44.4 ± 3.7 × 10⁻⁸
N233-300       2.48 ± 0.28 × 10⁴      1.58 ± 0.24 × 10⁻³     64.1 ± 7.2 × 10⁻⁹
N301-448       -ᵇ                     -                      -

ᵃ Kd values were obtained by dividing kd by ka.
ᵇ No significant binding was observed.
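
As a cross-check on the ordering of the RNA-binding regions, the sketch below recomputes Kd = kd/ka from the rate constants listed in Table I and ranks the constructs from tightest to weakest binding. It uses only the central values and ignores the quoted uncertainties, which is a simplification of this sketch.

```python
# Rate constants for RNA binding as listed in Table I:
# construct -> (ka in 1/(M*s), kd in 1/s). N301-448 showed no significant binding.
TABLE_I = {
    "Full-length": (4.82e4, 5.75e-4),
    "N1-173": (2.18e4, 1.5e-3),
    "N174-232": (7.97e3, 3.53e-3),
    "N233-300": (2.48e4, 1.58e-3),
}

# Equilibrium dissociation constant (M) for each construct.
kd_by_construct = {name: k_off / k_on for name, (k_on, k_off) in TABLE_I.items()}

# A smaller Kd means tighter binding, so sort in ascending order.
ranked = sorted(kd_by_construct, key=kd_by_construct.get)

for name in ranked:
    print(f"{name:12s} Kd = {kd_by_construct[name] * 1e9:6.1f} nM")
```

The recomputed values lie within the uncertainties quoted in the table, and N174-232 emerges as the weakest binder by roughly sevenfold relative to N1-173 and N233-300.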


Oligomerization of the HCoV-OC43 
nucleocapsid protein 

The self-association characteristics of SARS-CoV and
HCoV-229E N proteins have been reported previously.
We further characterized the oligomerization
of full-length HCoV-OC43 N proteins using crosslinking
assays. As shown in Figure 5(A), HCoV-OC43 N
protein was able to form dimers, trimers, tetramers,
and higher-order multimers as the glutaraldehyde
concentration was increased. To evaluate whether
electrostatic interactions stabilize the compact conformations
of higher-order multimers, the oligomeric
behavior of N protein at different ionic strengths
was studied. Briefly, the purified full-length N
proteins, which were in a buffer containing sodium
chloride, were cross-linked with 0.04% glutaraldehyde.
In addition, we analyzed the oligomerization of N protein
using analytical gel filtration chromatography
[Fig. 5(B)]. N protein eluted with a retention volume
of ~49.5 mL, corresponding to an apparent molecular
weight of 621 kDa, based on the standard curve. Given
the monomer molecular weight of ~50 kDa, this
implies an oligomer containing ~12 molecules.


Identification of the oligomerization region of 
HCoV-OC43 nucleocapsid protein 

To determine the regions that are most important
for HCoV-OC43 N protein oligomerization, several
regions of N protein were purified and tested
using crosslinking assays and analytical gel filtration.
As shown in Figure 6(A), crosslinking studies of
N233-448 detected dimers, tetramers, and oligomers.
A shorter form of the protein, N301-448, also formed
dimers, trimers, and oligomers [Fig. 6(B)]. Gel filtration
analysis showed a peak of N301-448 at high
molecular weight (225 kDa), consistent with a
~12-mer [Fig. 6(C)]. The crosslinking assay was also
conducted to investigate oligomerization of the N-terminal
(N1-173) and middle (N174-232) segments of N protein.
Both truncated N proteins remained as monomers
under these experimental conditions [Fig. 7],
suggesting that these regions are not involved in
oligomerization. Our data showed that the 148 amino
acids at the C-terminus constitute the oligomerization
domain of HCoV-OC43 N protein.

Figure 5. (A) Crosslinking assay of full-length N protein in
the presence of various concentrations of glutaraldehyde.
The concentration of the protein was 2 µM. (B) Gel filtration
chromatography-based analysis (Superdex 300 XK16/70) of
N protein dissolved in 50 mM Tris-HCl (pH 7.3), 150 mM
NaCl, and 0.1% CHAPS. The concentration of the target
protein was 3 µM. Standard proteins of 669 kDa, 158 kDa,
67 kDa, and 43 kDa are denoted as 1, 2, 3, and 4,
respectively.

Figure 6. (A) Crosslinking assay of N233-448 in the
presence of glutaraldehyde at various concentrations. The
concentration of target protein was 2 µM. (B) Crosslinking
assay of N301-448 in the presence of the indicated
concentrations of glutaraldehyde. The concentration of
target protein was 3 µM. (C) Gel filtration
chromatography-based analysis (Superdex 200 XK16/70) of N301-448
buffered by 50 mM Tris-HCl with 150 mM NaCl and 0.1%
CHAPS at pH 7.3. The concentration of target protein was
5 µM. The protein markers of 669 kDa, 223 kDa, 67 kDa,
and 43 kDa are denoted as 1, 2, 3, and 4, respectively.


Discussion 

We have previously analyzed the full-length SARS-CoV
and HCoV-229E N proteins by CD and found them to
be relatively disordered, showing nearly 50% disordered
structures. Here, we showed that the HCoV-OC43
N protein, with ~35% α-helix content, possesses a
more ordered structure than other human coronavirus
N proteins, including those of SARS-CoV and HCoV-229E
(Supporting Information Figure S2). Unlike
SARS-CoV and HCoV-229E N proteins, the recombinant
HCoV-OC43 N protein is highly resistant to
proteolysis. According to previous heat and urea
denaturation analyses, the Tm and Cm values of SARS-CoV
N protein are 38°C and 2.77 M, respectively,
implying a low stability of the SARS-CoV N protein.
In this work, the Tm and Cm of the HCoV-OC43 N protein
were found to be 52°C and 4 M, respectively. Our
results suggest that the HCoV-OC43 N protein is more
stable than the SARS-CoV N protein.

Figure 7. (A) Crosslinking assay of N1-173 in the presence
of glutaraldehyde at various concentrations. The concentration
of target protein was 2 µM. (B) Crosslinking assay of
N174-232 in the presence of the indicated concentrations
of glutaraldehyde. The concentration of target protein
was 1.5 µM.

In this study, we have shown that the recombinant
HCoV-OC43 N protein expressed in E. coli binds
to the intergenic sequence of the RNA genomic leader.
This binding demonstrates that the recombinant
N protein obtained was properly folded. We examined the
binding of recombinant HCoV-OC43 N protein to
nucleic acids. Our kinetic data showed that HCoV-OC43
N protein binds preferentially to single-stranded
RNA over the corresponding single-stranded DNA.
The decreased affinity for DNA may be due to two factors.
First, the 2′-hydroxyl group of ribose in RNA
may form hydrogen bonds with residues of the N protein.
Second, in DNA, the methyl group on the C5 of
thymine may create a steric clash with residues within
the RNA-binding groove. The Hiscox group suggested
that phosphorylation of the N protein determines the
recognition of viral RNA because phosphorylation
may alter the conformation of N protein, and thus
affect its RNA-binding sites. In addition,
Mohandas et al. have reported that dephosphorylation
of MHV N protein by cellular phosphoprotein phosphatase
may facilitate the infectious process. However,
the role of phosphorylation in the RNA-binding
sites of HCoV-OC43 N protein was not investigated in
our study because the N protein was expressed in
E. coli. It would be interesting to study the effects of
phosphorylation on the conformation and RNA-binding
properties of HCoV-OC43 N protein in the future.

Previous studies have reported that the N-terminus
of SARS-CoV N protein provides a scaffold for
RNA binding. X-ray analysis revealed that the
fold of the N-terminal domain of N protein is essentially
conserved across the various CoV groups. It has
a U-shaped structure with the two arms rich in basic
residues and the flexible loops well-ordered around
the β-sheet core of the N-terminal domain.
Spencer and Hiscox found that the N-terminal region
of IBV N protein facilitates long-range, nonspecific
interactions between N protein and viral RNA, thus leading
to the formation of the ribonucleocapsid via a lure and
lock mechanism. Similar to the N-terminal domain
of SARS-CoV and IBV N proteins, the N-terminal domain
of HCoV-OC43 N protein contains many positively
charged residues and has been shown by SPR to
be responsible for RNA binding. In addition to the
N-terminal domain (N1-173), two highly positively
charged portions of the central linker region,
N174-232 and N233-300, also show RNA-binding activity.
Given their high pI values, these RNA-binding regions
may rely on electrostatic interactions to interact with
RNA. Compared with N1-173 and N233-300,
N174-232 has lower RNA-binding affinity and, therefore,
appears to play an auxiliary role in RNA binding,
despite the presence of a serine/arginine motif (SR-rich
motif) that is usually involved in RNA recognition.
In a previous study, the SR-rich motif of MHV N protein
was shown to be the RNA-binding region. In
addition, the C-terminal domain of SARS-CoV N protein
has been proposed to bind nucleic acids and to
participate in genomic RNA binding. However,
our studies suggest that the C-terminal domain
(N301-448) of the HCoV-OC43 N protein does not
bind RNA.

A previous study showed that SARS-CoV N protein
forms high-order oligomers through the action of
its C-terminal 138 amino acids. These oligomers exist
predominantly as dimers. The N-terminal domain
allows for the association of dimers to form
higher-order oligomers. Crystal structures of the C-terminal
domain of SARS-CoV and IBV N proteins show a similar
general polypeptide fold, which strongly suggests
that the dimerized N protein is the functional unit in
vivo for the four groups of coronaviruses. The
crystal structure of the C-terminal domain shows a
tightly intertwined, twofold symmetric C-terminal domain
dimer with a β-hairpin (β1 and β2) from one
subunit extending into the cavity of the opposite subunit;
this forms an antiparallel β-sheet with hydrogen
bonds across the dimer interface. Luo et al. have
reported that this dimer association does not involve
electrostatic forces. These authors have further suggested
that the SR-rich motif of the SARS-CoV N protein
binds to the central region and thereby triggers
the multimerization of dimers. In addition, Fan et al.
showed that the N-terminal region of the IBV N protein
is involved in the oligomerization of N protein.
Here, we analyzed the self-association properties of
the HCoV-OC43 N protein and found it to form
oligomers. We identified the C-terminal 148 amino
acids of HCoV-OC43 N protein as the oligomerization
domain. When expressed in isolation, this region
of the C-terminus formed oligomers at concentrations
as low as micromolar, and this oligomerization occurred
in the absence of the N-terminal domain. The other
truncated constructs of HCoV-OC43 N protein,
N1-173 and N174-232, were found to remain as monomers
under all of the tested experimental conditions,
suggesting that these two regions are not involved in
oligomerization during ribonucleocapsid formation.

In summary, we found that the HCoV-OC43 N protein
exhibits specific properties compared with those of other
CoVs. For example, HCoV-OC43 N protein is more
highly ordered and more stable than the previously
studied SARS-CoV N protein. We also showed that
HCoV-OC43 N protein is composed of three RNA-binding
regions lying within the N-terminal region (residues
1-173) and central-linker region (residues 174-232
and 233-300) that bind to RNA in a cooperative manner.
Although the C-terminal region (N301-448) lacks
RNA-binding activity, this region plays an important
role in the oligomerization of the HCoV-OC43 N protein.
This study may benefit the development of drugs
to disrupt the binding of viral N protein with RNA
and viral assembly.


Materials and Methods 

Drugs and reagents were purchased from Sigma 
Chemical Co. All oligoribonucleotides and oligodeoxyr- 
ibonucleotides were synthesized using an automated 
DNA synthesizer and purified by gel electrophoresis. 
The biotin-linked oligomers were synthesized by incorporating
the biotin synthon at the 5′-end of the
oligomers; oligomers were then immobilized to the 
streptavidin-coated biosensor chip used for the surface 
plasmon resonance (SPR) experiments. 


Expression and purification of the full-length 
and truncated N proteins 

The templates for the HCoV-OC43 N protein were pro- 
vided by the Institute of Biological Chemistry, Aca- 
demia Sinica (Taipei, Taiwan). To generate both full- 
length and truncated forms of the recombinant N pro- 
teins, the N protein gene was amplified by polymerase 
chain reaction (PCR) from plasmid pGENT using vari- 
ous primers. The PCR products were digested with 
NdeI and XhoI, and the DNA fragments were cloned 
into pET28a (Novagen) using T4 ligase (NEB). Bacte- 
ria transformed with the resultant plasmid were grown 
in culture. Protein expression was induced by supple- 
menting the culture medium with IPTG to 1 mM fol- 
lowed by incubation at 10°C for 24 h. After harvesting 
the bacteria by centrifugation (3500g, 30 min, 4°C), 
the bacterial pellets were lysed with lysis buffer (50 
mM Tris-buffered solution, pH 7.3, 150 mM NaCl, 
0.1% CHAPS, and 15 mM imidazole). Soluble proteins 
were obtained from the supernatant following centrifu- 
gation (15,000 rpm, 30 min, 4°C) to remove the pre- 
cipitate. Full-length and truncated N proteins carrying 
a His6-tag at their N-termini were purified using a Ni- 
NTA column (Novagen) with an elution gradient rang- 
ing from 15 to 300 mM imidazole. The pure fractions 
were collected and dialyzed against low-salt buffer. 
Since the N protein is positively charged, it was further
purified by SP cation exchange chromatography
using a gradient from 0.05 M to 1.5 M NaCl in 50 mM
Tris at pH 7.3, 0.1% CHAPS. The protein concentrations
were determined using the Bradford method
with Bio-Rad protein assay reagents.


Circular dichroism spectroscopy 

Circular dichroism (CD) spectra were obtained using a
JASCO-815 CD spectropolarimeter. Temperature was
controlled by circulating water at the desired temperature
in the cell jacket. Each protein was dissolved in
50 mM Tris-buffered solution, pH 7.3, 150 mM NaCl,
and 0.1% CHAPS. The CD spectra were collected
between 250 and 190 nm with a 1 nm bandwidth at
1 nm intervals. All spectra were obtained from an average
of five scans. The photomultiplier voltage did
not exceed 600 V during the analysis. CD spectra were
normalized by subtraction of the background scan
with buffer alone. The mean residue ellipticity, [θ],
was calculated based on the equation
[θ] = (MRW × θλ)/(10 × l × c), where MRW is the mean residue
weight, θλ is the measured ellipticity in millidegrees at
wavelength λ, l is the cuvette pathlength (0.1 cm), and
c is the protein concentration in g/mL. The results
were analyzed using the SELCON3 program to calculate
the percentage of each type of secondary structure.
In addition, Tm was determined from the polynomial
fitting of the observed curve and taken as the
temperature corresponding to half denaturation of the
N protein. The first derivative of absorption with
respect to temperature, dA/dT, of the melting curve
was computer-generated and used for determining
the Tm.
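
The mean residue ellipticity conversion described in this section can be sketched as follows. The sketch assumes the ellipticity is entered in millidegrees and the concentration in mg/mL, a pairing that yields the conventional units of deg cm² dmol⁻¹ (entering degrees and g/mL gives the same number). The raw signal of −10 mdeg is a hypothetical input, and the mean residue weight is approximated from the 50 kDa and 448-residue figures given in the text, ignoring the His6-tag.

```python
def mean_residue_ellipticity(theta_mdeg, mrw, path_cm, conc_mg_per_ml):
    """[theta] = MRW * theta / (10 * l * c), in deg cm^2 dmol^-1.

    Assumes theta in millidegrees and c in mg/mL (unit pairing assumed here).
    """
    return (mrw * theta_mdeg) / (10.0 * path_cm * conc_mg_per_ml)

MOLECULAR_WEIGHT = 50_000.0   # g/mol, approximate value given in the text
N_RESIDUES = 448              # full-length N protein, tag ignored (assumption)
MRW = MOLECULAR_WEIGHT / N_RESIDUES

PROTEIN_UM = 5.0                                      # µM, as in Figure 1(A)
CONC_MG_ML = PROTEIN_UM * 1e-6 * MOLECULAR_WEIGHT     # mol/L * g/mol = g/L = mg/mL

THETA_MDEG = -10.0            # hypothetical raw signal, for illustration only
MRE = mean_residue_ellipticity(THETA_MDEG, MRW, 0.1, CONC_MG_ML)
print(f"MRW = {MRW:.1f} g/mol, c = {CONC_MG_ML:.2f} mg/mL, [theta] = {MRE:.0f}")
```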


Fluorescence spectroscopy 

In the urea-induced unfolding experiment, N protein 
was added to a final concentration of 1 µM to buffer
containing various concentrations of urea, and the 
samples were incubated at 25°C for 30 min. The buffer 
consisted of 50 mM Tris, pH 7.3, and 0.1% CHAPS. 
Tryptophan fluorescence measurements were made 
using an F-4500 fluorescence spectrophotometer 
(Hitachi) equipped with a cuvette with a 1 cm light 
path. The excitation wavelength was 288 nm, and 
the emission data were collected between 300 and 
400 nm. 


Surface plasmon resonance binding 
experiments 

The affinity, association, and dissociation of N proteins
and RNA were measured in a BIAcore 3000A SPR
instrument (Pharmacia, Uppsala, Sweden) equipped
with a SensorChip SA5 from Pharmacia; the apparatus
measured binding by monitoring the refractive index
change of the sensor chip surface. These changes,
recorded in resonance units (RU), are generally
assumed to be proportional to the mass of the molecules
bound to the chip. The surface was first washed
three times by injecting 10 µL of 100 mM NaCl solution
with 50 mM NaOH. To control the amount of
RNA (or DNA) bound to the SA chip surface, the
biotinylated oligomer was immobilized manually onto the
surface of a streptavidin chip until a signal of 1200 RU
was achieved in the first cell. The chip surface was
then washed with 10 µL of 10 mM HCl to eliminate
nonspecific binding. The second flow cell was
unmodified and served as a control. The appropriate N
proteins were dissolved in 50 mM Tris (pH 7.3) with
150 mM NaCl and 0.1% CHAPS, and passed over the
chip surface for 140 s at a flow rate of 30 µL/min to
reach equilibrium. Blank buffer solution was then
passed over the chip to initiate the dissociation
reaction; this step was continued for an additional
600 s to complete the reaction. After 600 s, the surface
was regenerated by washing with 10 µL of 0.1% SDS
for each single-stranded RNA (or DNA). Before fitting
to the 1:1 Langmuir model, binding data were corrected
by subtraction of the control to account for
simple refractive index differences. Sensorgrams for
interactions between RNA (or DNA) and protein were
analyzed using BIAevaluation software (version 3). To
verify whether mass transport effects arise during the
interaction between N protein and immobilized RNA,
binding experiments were performed at different flow
rates and different surface binding capacities. The
binding interaction between N protein
and RNA was found not to be a mass transport-limited process
because the derived kinetic parameters (especially the
association rate constant, ka) are independent of the
flow rate and binding capacity.
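
To make the 1:1 Langmuir model concrete, the sketch below simulates idealized association and dissociation phases using the integrated rate equations R(t) = Req·(1 − exp(−(ka·C + kd)·t)) and R(t) = R0·exp(−kd·t). It uses the ka and kd reported for RNA, the four protein concentrations from Figure 3(B), and the 140 s and 600 s phase lengths described above. The maximum response of 1000 RU is a hypothetical value chosen only for illustration, and the sketch is not the fitting routine of the instrument software.

```python
import math

KA = 4.82e4    # 1/(M*s), association rate constant reported for RNA
KD = 5.75e-4   # 1/s, dissociation rate constant reported for RNA
RMAX = 1000.0  # RU, hypothetical saturation response (assumption)

def association_response(t_s, conc_molar, ka=KA, kd=KD, rmax=RMAX):
    """Response (RU) after t_s seconds of analyte injection, 1:1 Langmuir model."""
    k_obs = ka * conc_molar + kd               # observed pseudo-first-order rate
    r_eq = rmax * ka * conc_molar / k_obs      # steady-state response
    return r_eq * (1.0 - math.exp(-k_obs * t_s))

def dissociation_response(t_s, r_start, kd=KD):
    """Response (RU) t_s seconds after switching to blank buffer."""
    return r_start * math.exp(-kd * t_s)

concentrations_um = (0.7, 0.35, 0.175, 0.0875)
end_of_injection = [association_response(140.0, c * 1e-6) for c in concentrations_um]

for conc, r140 in zip(concentrations_um, end_of_injection):
    r_end = dissociation_response(600.0, r140)
    print(f"{conc:7.4f} µM: R(140 s) = {r140:6.1f} RU, after 600 s wash = {r_end:6.1f} RU")
```

In this model the association phase depends on analyte concentration, whereas the fraction of signal remaining after the wash depends only on kd, in line with the behavior noted in the legend of Figure 3.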


Chemical crosslinking assay 

To investigate the oligomerization features of N pro- 
teins, a chemical crosslinking experiment was per- 
formed. A series of protein solutions containing N pro- 
teins was supplemented with various concentrations of 
glutaraldehyde, and each reaction mixture was incu- 
bated at room temperature for 5 min. The reaction 
was stopped by the addition of 1 M Tris, pH 7.3 (0.5%,
v/v, final concentration) and incubation on ice. The 
sample solution was then analyzed by SDS-PAGE. 


Analytical gel filtration chromatography 

Gel filtration experiments were performed using a fast
protein liquid chromatography system (Amersham Biosciences)
equilibrated with buffer containing 50 mM Tris-HCl
(pH 7.3), 150 mM NaCl, and 0.1% CHAPS at a
flow rate of 0.5 mL/min. Blue dextran was used to
determine the void volume (Vo). Several proteins of
known molecular weight were used as standards
(thyroglobulin, 669 kDa; catalase, 232 kDa; aldolase, 158
kDa; albumin, 67 kDa; ovalbumin, 43 kDa;
chymotrypsinogen A, 25 kDa; and ribonuclease A, 13.7 kDa;
purchased from GE Healthcare), and their elution volumes
(Ve) were determined. The standard curve was
plotted as the logarithm of the molecular weight
against Ve/Vo for the standard proteins.
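
A standard curve of this kind can be sketched as a least-squares line through log10(molecular weight) against Ve/Vo. The molecular weights below are those of the standards named above, but the Ve/Vo ratios are hypothetical placeholders, because the measured elution volumes are not given in the text. The sketch therefore shows the procedure only and does not reproduce the reported apparent molecular weights.

```python
import math

# (standard, molecular weight in kDa, Ve/Vo). The Ve/Vo ratios are HYPOTHETICAL.
STANDARDS = [
    ("thyroglobulin", 669.0, 1.10),
    ("catalase", 232.0, 1.35),
    ("aldolase", 158.0, 1.45),
    ("albumin", 67.0, 1.65),
    ("ovalbumin", 43.0, 1.78),
    ("chymotrypsinogen A", 25.0, 1.92),
    ("ribonuclease A", 13.7, 2.05),
]

def fit_standard_curve(standards):
    """Least-squares fit of log10(MW) = intercept + slope * (Ve/Vo)."""
    xs = [ratio for _, _, ratio in standards]
    ys = [math.log10(mw) for _, mw, _ in standards]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean

def apparent_mw_kda(ve_over_vo, slope, intercept):
    """Apparent molecular weight (kDa) read off the fitted standard curve."""
    return 10.0 ** (intercept + slope * ve_over_vo)

SLOPE, INTERCEPT = fit_standard_curve(STANDARDS)
print(f"log10(MW) = {INTERCEPT:.3f} + ({SLOPE:.3f}) * Ve/Vo")
print(f"Apparent MW at Ve/Vo = 1.45: {apparent_mw_kda(1.45, SLOPE, INTERCEPT):.0f} kDa")
```

Dividing an apparent molecular weight obtained this way by the monomer mass gives the approximate number of subunits in the oligomer.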